Parameter Optimization for Iterative Confusion Network Decoding in Weather-Domain Speech Recognition
Authors
Abstract
In this paper, we apply a set of approaches for efficiently rescoring the output of an automatic speech recognition (ASR) system on weather-domain data. Since in-domain data is usually insufficient for training an accurate language model (LM), we use an automatic selection method to extract domain-related sentences from a general text resource, and then train an N-gram language model on the selected set. We use this LM, together with a pre-trained acoustic model, to recognize the development and test instances. The recognizer generates a confusion network (CN) for each instance. We then use a recurrent neural network language model (RNNLM), trained on the in-domain data, to iteratively rescore the CNs. Rescoring the CNs in this way requires estimating the weights of the RNNLM, N-gram LM, and acoustic model scores. Weight optimization is the critical part of this work, for which we propose using the minimum error rate training (MERT) algorithm along with a novel N-best list extraction method. The experiments are carried out on weather forecast domain data provided in the framework of the EUBRIDGE project.
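The weighted combination of acoustic, N-gram LM, and RNNLM scores described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it rescores a plain N-best list rather than a confusion network, and it replaces MERT with an exhaustive grid search over the two LM weights, with the acoustic weight fixed at 1.0. The names `Hypothesis`, `rescore`, and `tune_weights`, and all scores, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Hypothesis:
    words: list[str]
    acoustic: float  # log-score from the acoustic model (made-up value)
    ngram: float     # log-probability from the N-gram LM (made-up value)
    rnnlm: float     # log-probability from the RNNLM (made-up value)


def combined_score(hyp, weights):
    """Log-linear combination of the three model scores."""
    w_acoustic, w_ngram, w_rnnlm = weights
    return w_acoustic * hyp.acoustic + w_ngram * hyp.ngram + w_rnnlm * hyp.rnnlm


def rescore(nbest, weights):
    """Return the hypothesis with the highest combined score."""
    return max(nbest, key=lambda hyp: combined_score(hyp, weights))


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,                           # deletion
                            curr[j - 1] + 1,                       # insertion
                            prev[j - 1] + (ref_word != hyp_word))) # substitution
        prev = curr
    return prev[-1]


def tune_weights(dev_set, grid):
    """Grid search (a stand-in for MERT, not MERT itself).

    dev_set is a list of (nbest, reference_words) pairs. The acoustic
    weight is fixed at 1.0 (an assumption); only the LM weights vary.
    """
    best_weights, best_errors = None, None
    for w_ngram, w_rnnlm in product(grid, repeat=2):
        weights = (1.0, w_ngram, w_rnnlm)
        errors = sum(edit_distance(ref, rescore(nbest, weights).words)
                     for nbest, ref in dev_set)
        if best_errors is None or errors < best_errors:
            best_weights, best_errors = weights, errors
    return best_weights, best_errors


if __name__ == "__main__":
    nbest = [
        Hypothesis("rain is expected today".split(), -10.0, -8.0, -6.0),
        Hypothesis("rein is expected today".split(), -9.5, -9.0, -9.0),
    ]
    reference = "rain is expected today".split()
    print(tune_weights([(nbest, reference)], grid=[0.0, 0.5, 1.0]))
```

In this toy setup the acoustic score alone prefers the wrong hypothesis, and the language model weights are what shift the decision. A grid search scales poorly with the number of weights, which is one reason a line-search procedure such as MERT is attractive for this kind of tuning.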
Similar resources
Pseudo-morpheme and Confusion Network Based Korean-english Statistical Spoken Language Translation System
In this demonstration, we present POSSLT (POSTECH Spoken Language Translation) for a Korean-English statistical spoken language translation (SLT) system using pseudo-morpheme and confusion network (CN) based technique. Like most other SLT systems, automatic speech recognition (ASR) and machine translation (MT) are coupled in a cascading manner in our SLT system. We used confusion network based ...
Direct word graph rescoring using A* search and RNNLM
The usage of Recurrent Neural Network Language Models (RNNLMs) has allowed reaching significant improvements in Automatic Speech Recognition (ASR) tasks. However, to take advantage of their capability for considering long histories, they are usually used to rescore the N-best lists (i.e. it is in practice not possible to use them directly during acoustic trellis search). We propose in this pape...
Spoken Commands in a Smart Home: An Iterative Approach to the Sphinx Algorithm
An algorithm for decoding commands spoken in an intelligent environment through iterative vocabulary reduction is presented. Current research in the field of speech recognition focuses primarily on the optimization of algorithms for single pass decoding using large vocabularies. While this is ideal for processing conversational speech, alternative methods should be explored for different domain...
Automatic estimation of decoding parameters using large-margin iterative linear programming
The decoding parameters in automatic speech recognition — grammar factor and word insertion penalty — are usually determined by performing a grid search on a development set. Recently, we cast their estimation as a convex optimization problem, and proposed a solution using an iterative linear programming algorithm. However, the solution depends on how well the development data set matches with ...
Confusion-based entropy-weighted decoding for robust speech recognition
An entropy-based feature parameter weighting scheme was proposed previously [1], in which the scores obtained from different feature parameters are weighted differently in the decoding process according to an entropy measure. In this paper, we propose a more delicate entropy measure for this purpose considering the inherent confusion among different acoustic classes. If a set of acoustic classe...
Publication date: 2013